A Kana-Kanji Translation System for Non-Segmented Input Sentences Based on Syntactic and Semantic Analysis

Authors

  • Masahiro Abe
  • Yoshimitsu Ooshima
  • Katsuhiko Yuura
  • Nobuyuki Takeichi
Abstract

This paper presents a disambiguation approach for translating non-segmented Kana into Kanji. The method consists of two steps. In the first step, an input sentence is analyzed morphologically and ambiguous morphemes are stored in a network form. In the second step, the best path, which is a string of morphemes, is selected by syntactic and semantic analysis based on case grammar. To avoid the combinatorial explosion of possible paths, the following heuristic search method is adopted. First, a path that contains the smallest number of weighted morphemes is chosen as the quasi-best path by a best-first search technique. Next, the restricted range of morphemes near the quasi-best path is extracted from the morpheme network to construct preferential paths. An experimental system incorporating large dictionaries has been developed and evaluated. A translation accuracy of 90.5% was obtained. This can be improved to about 95% by optimizing the dictionaries.
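The first stage of the search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the romanized readings, the toy dictionary `TOY_DICT`, the per-morpheme weights, and the function names `build_network` and `quasi_best_path` are all assumptions made for illustration, and the abstract does not say how the weights are assigned. The sketch builds a morpheme network from every dictionary match in the unsegmented input, then runs a best-first (uniform-cost) search for the path with the smallest total weight.

```python
import heapq

# Hypothetical toy dictionary: reading -> list of (surface form, weight).
# Readings are romanized for readability; weights are invented.
TOY_DICT = {
    "ki": [("木", 1.0), ("気", 1.0)],
    "sha": [("社", 1.5)],
    "kisha": [("記者", 1.0), ("汽車", 1.0)],
    "de": [("で", 0.5)],
    "kaeru": [("帰る", 1.0), ("変える", 1.0)],
}


def build_network(text, dictionary):
    """Return edges[start] = [(end, surface, weight), ...] for every match."""
    edges = {i: [] for i in range(len(text) + 1)}
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            for surface, weight in dictionary.get(text[start:end], []):
                edges[start].append((end, surface, weight))
    return edges


def quasi_best_path(text, dictionary):
    """Best-first search for the minimum-weight morpheme path, or None."""
    edges = build_network(text, dictionary)
    # Heap entries: (cost so far, tie-break counter, position, morphemes so far).
    heap = [(0.0, 0, 0, [])]
    counter = 1
    settled = set()  # positions already expanded at their lowest cost
    while heap:
        cost, _, pos, path = heapq.heappop(heap)
        if pos == len(text):
            return cost, path
        if pos in settled:
            continue
        settled.add(pos)
        for end, surface, weight in edges[pos]:
            heapq.heappush(heap, (cost + weight, counter, end, path + [surface]))
            counter += 1
    return None  # no sequence of dictionary entries covers the input


result = quasi_best_path("kishadekaeru", TOY_DICT)
print(result)
```

Because all weights are non-negative, the first complete path popped from the heap has the minimum total weight. Homophones with equal weight, such as the two candidates for "kisha" here, are not resolved by this step; in the paper's method that choice belongs to the second stage, where syntactic and semantic analysis is applied to morphemes near the quasi-best path.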


Similar resources

An Automatic Translation System Of Non-Segmented Kana Sentences Into Kanji-Kana Sentences

This paper presents the algorithms to solve the two main problems comprised in the automatic Kana-Kanji translation system, in which the input sentences in Kana are translated into ordinary Japanese sentences in Kanji and Kana: the segmentation of non-segmented sentences into Bunsetsu and...


Collocational analysis in Japanese text input

This paper proposes a new disambiguation method for Japanese text input. This method evaluates candidate sentences by measuring the number of Word Co-occurrence Patterns (WCP) included in the candidate sentences. An automatic WCP extraction method is also developed. An extraction experiment using the example sentences from dictionaries confirms that WCP can be collected automatically with an acc...


Automatic semantic role labeling in Persian sentences using dependency trees

Automatically identifying words that carry semantic roles (such as Agent, Patient, Source, etc.) in sentences and attaching the correct roles to them may improve many natural language processing tasks, including information extraction, question answering, text summarization and machine translation. Semantic role labeling systems usually take advantage of syntactic parsing and th...


Semantic role labeling of Persian sentences using a memory-based learning approach

Extracting semantic roles is one of the major steps in representing text meaning. It refers to finding the semantic relations between a predicate and syntactic constituents in a sentence. In this paper we present a semantic role labeling system for Persian, using a memory-based learning model and standard features. Our proposed system implements a two-phase architecture to first identify...


An Evaluation To Detect And Correct Erroneous Characters Wrongly Substituted, Deleted And Inserted In Japanese And English Sentences Using Markov Models

Key words: Markov model, error detection, error correction, bunsetsu, substitution, deletion, insertion. 1 Introduction. In order to improve the man-machine interface with computers, the development of input devices such as optical character readers (OCR) or speech recognition devices is expected. However, it is not easy to input Japanese sentences by these devices, because th...




Publication date: 1986